Multi-Advisor Reinforcement Learning

Authors

  • Romain Laroche
  • Mehdi Fatemi
  • Joshua Romoff
  • Harm van Seijen
Abstract

This article deals with a novel branch of Separation of Concerns, called Multi-Advisor Reinforcement Learning (MAd-RL), where a single-agent RL problem is distributed to n learners, called advisors. Each advisor tries to solve the problem with a different focus. Their advice is then communicated to an aggregator, which is in control of the system. For the local training, three off-policy bootstrapping methods are proposed and analysed: local-max bootstraps with the local greedy action, rand-policy bootstraps with respect to the random policy, and agg-policy bootstraps with respect to the aggregator’s greedy policy. MAd-RL is positioned as a generalisation of Reinforcement Learning with Ensemble methods. An experiment is conducted on a simplified version of the Ms. Pac-Man Atari game. The results confirm the theoretical relative strengths and weaknesses of each method.
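The three bootstrapping methods named in the abstract can be illustrated with a minimal Python sketch of a one-step target for a single advisor. The abstract does not give the update rules, so everything here is an assumption made for illustration: the function names `aggregator_greedy_action` and `bootstrap_target`, the Q-learning-style target form, and in particular the aggregator, which is taken to be an unweighted sum of the advisors' Q-values.

```python
from typing import Sequence


def aggregator_greedy_action(q_next_all: Sequence[Sequence[float]]) -> int:
    """Greedy action of a hypothetical aggregator.

    Assumption: the aggregator scores each action by summing the
    advisors' Q-values for that action (unweighted).
    """
    n_actions = len(q_next_all[0])
    totals = [sum(q[a] for q in q_next_all) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: totals[a])


def bootstrap_target(
    method: str,
    advisor: int,
    reward: float,
    gamma: float,
    q_next_all: Sequence[Sequence[float]],
) -> float:
    """One-step target for one advisor under the chosen bootstrap.

    q_next_all[j][a] is advisor j's Q-value estimate for action a
    in the next state.
    """
    q_next = q_next_all[advisor]
    if method == "local-max":
        # Bootstrap with the advisor's own greedy action.
        boot = max(q_next)
    elif method == "rand-policy":
        # Bootstrap with respect to the uniformly random policy.
        boot = sum(q_next) / len(q_next)
    elif method == "agg-policy":
        # Bootstrap with the action the aggregator would take greedily.
        boot = q_next[aggregator_greedy_action(q_next_all)]
    else:
        raise ValueError(f"unknown bootstrap method: {method}")
    return reward + gamma * boot


# Two advisors, three actions, made-up next-state Q-values.
q_next_all = [[1.0, 0.0, 0.5], [0.0, 2.0, 0.5]]
targets = {
    m: bootstrap_target(m, advisor=0, reward=0.0, gamma=0.5, q_next_all=q_next_all)
    for m in ("local-max", "rand-policy", "agg-policy")
}
```

With these made-up values, the first advisor prefers action 0 locally while the summed aggregator prefers action 1, so the three methods yield three different targets for the same transition.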

Similar resources

A Cognitive Robot Collaborative Reinforcement Learning Algorithm

A cognitive collaborative reinforcement learning algorithm (CCRL) that incorporates an advisor into the learning process is developed to improve supervised learning. An autonomous learner is enabled with a self awareness cognitive skill to decide when to solicit instructions from the advisor. The learner can also assess the value of advice, and accept or reject it. The method is evaluated for r...

Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach

This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...

Utilizing negative policy information to accelerate reinforcement learning

ACKNOWLEDGEMENTS One consequence of my long tenure at Georgia Tech has been the opportunity to get to know a large number of exceptional people. I'm thankful for tremendous support from some particularly exceptional people: advisor Charles Isbell, a gentleman and a scholar, who gave me the opportunity to think outside the box; and my adoptive advisor Andrea Thomaz, who helped me see it's also q...

Morphogenetic Computing and Reinforcement Learning for Multi-agent Systems

There are some major limitations to conventional rule-based approaches for multi-robot system applications. First, as the complexity and scale of swarm robots grow, most rule-based methods are not able to accomplish the tasks due to the extensive communication and computational loads. Second, in most of the rule-based control systems, certain properties of the systems are predefined by the desi...

ADVISOR: A Machine Learning Architecture for Intelligent Tutor Construction

We have constructed ADVISOR, a two-agent machine learning architecture for intelligent tutoring systems (ITS). The purpose of this architecture is to centralize the reasoning of an ITS into a single component to allow customization of teaching goals and to simplify improving the ITS. The first agent is responsible for learning a model of how students perform using the tutor in a variety of cont...

Journal title:
  • CoRR

Volume abs/1704.00756  Issue 

Pages  -

Publication date 2017